Summary

We mutated, by gene targeting, the endogenous 
hypoxanthine phosphoribosyl transferase (HPRT) gene
in mouse embryo-derived stem (ES) cells. A specialized
construct of the neomycin resistance (neo r) gene was
introduced into an exon of a cloned fragment of the
Hprt gene and used to transfect ES cells.
Among the G418 r colonies, 1/1000 were also resistant 
to the base analog 6-thioguanine (6-TG). The G418 r , 
6-TG r cells were all shown to be Hprt - as the result 
of homologous recombination with the exogenous, 
neo r -containing, Hprt sequences. We have compared
the gene-targeting efficiencies of two classes of
neo r -Hprt recombinant vectors: those that replace
the endogenous sequence with the exogenous sequence
and those that insert the exogenous sequence
into the endogenous sequence. The targeting efficiencies
of both classes of vectors are strongly dependent
upon the extent of homology between exogenous
and endogenous sequences. The protocol
described herein should be useful for targeting muta- 
tions into any gene. 

Introduction

Gene targeting, the homologous recombination of DNA
sequences residing in the chromosome with newly introduced
DNA sequences, provides a means for systematically
altering the mammalian genome (Smithies et al.,
1985; Thomas et al., 1986; Thomas and Capecchi, 1986). 
A desired alteration would first be introduced into a cloned 
DNA sequence, and gene targeting would then transfer 
the alteration into the genome. Gene targeting should be 
equally effective for correcting or mutating the desired 
chromosomal locus. 

We initiated our analysis of gene targeting in cultured 
mammalian cells by studying recombination between a 
defective gene residing in the chromosome and newly in- 
troduced plasmid DNA carrying a different mutation in the 
same gene. For those experiments, we first established 
cell lines containing a mutant neomycin resistance gene 
(neo r ) integrated into the genome of mouse L cells. We 
were then able to restore the gene via homologous recom- 
bination by injecting DNA carrying a different mutation in 
the neo r gene. In the course of these experiments we un- 
covered two mechanisms for altering chromosomal se- 
quences. The first involved the transfer of information, by 
homologous recombination, from the newly introduced 
DNA into the cognate chromosomal sequence (Thomas et 
al., 1986). The second involved inducing mutations in the
homologous chromosomal sequence by what appears to 
entail incorrect repair of a heteroduplex formed between 
the newly introduced DNA and the cognate chromosomal 
sequence (Thomas and Capecchi, 1986). Each of the two 
methods has its own advantages. The transfer of informa- 
tion by homologous recombination allows one to mutate 
or correct the desired chromosomal locus in a defined 
manner. On the other hand, the frequency of altering 
chromosomal sequences by heteroduplex-induced muta- 
genesis promises to be higher than via homologous re- 
combination. This could prove to be a useful method for 
generating a large number of random mutations in spe- 
cific genes. 

In this current study we have extended our analysis of 
gene targeting by using an endogenous gene as the tar- 
get and by using embryo-derived stem (ES) cells as the 
recipient cell line. 

The target gene is hypoxanthine phosphoribosyl transferase
(Hprt). This gene was selected primarily for two reasons.
First, the Hprt gene lies on the X-chromosome.
Since ES cells derived from male embryos are hemizygous
for Hprt, only a single copy of the Hprt gene needs
to be inactivated in order to yield a selectable phenotype. 
Second, selection procedures have been developed for 
isolating Hprt - mutants. By far the most common pathway
for cells in culture to become resistant to the base
analog 6-thioguanine (6-TG) is to acquire a mutation in the 
Hprt gene (Sharp et al., 1973; Wahl et al., 1975). 

ES cells were chosen for these experiments because, 
following inactivation of a chosen gene by gene targeting,
they should provide the means to generate mice with the 
desired mutation. ES cells have been shown to be pluripo- 
tent in vitro and in vivo (Evans and Kaufman, 1981; Martin, 
1981). When reintroduced into mouse blastocysts, these
cells contribute efficiently to the formation of chimeras, in- 
cluding contributions to a functional germ line (Bradley et 
al., 1984). In addition, it has been shown recently that 
these cells can be manipulated in vitro without losing their 
capacity to generate germ-line chimeras. Following trans- 
fection with the neo r gene and selection for G418 r , these 
ES cells were used to produce germ-line chimeras that 
stably transmitted G418 r to subsequent generations 
(Gossler et al., 1986; Robertson et al., 1986). HPRT-deficient
mice were produced from ES cells that were either
selected for spontaneous Hprt - mutations (Hooper
et al., 1987) or selected for Hprt - following the random
insertion of retroviral DNA into the mouse genome (Kuehn
et al., 1987).

Here we describe the site-directed inactivation of the 
endogenous Hprt gene in male ES cells by gene targeting. 
We examine some parameters that affect the gene-tar- 
geting frequency as well as the mechanism of gene inacti- 
vation mediated by different recombinant vectors. Under 
optimal conditions, we find that 1/1000 cells transformed 
by exogenous DNA can undergo a gene-targeting event. 
The advantage of inactivating specific genes via gene targeting,
compared with random mutagenic methods such
as chemical mutagenesis or retroviral DNA insertion, is
twofold. First, the nature of the mutant allele is at the
discretion of the experimenter, and second, unlike random
mutagenic events, the frequency of the targeting events is
sufficiently high to make the procedure applicable to
nonselectable genes.

Figure 1. Disruption of the Hprt Gene by Gene Targeting
Two schemes for gene disruption, one by sequence replacement
vectors and one by sequence insertion vectors, are depicted. Vectors of
both classes contain Hprt sequences interrupted in the eighth exon
with the neo r gene.

(A) Sequence replacement. Sequence replacement vectors are designed
such that upon linearization, the vector Hprt sequences remain
colinear with the endogenous sequences. Following homologous pairing
between vector and genomic sequences, a recombination event
replaces the genomic sequences with the vector sequences containing
the neo r gene.

(B) Sequence insertion. Sequence insertion vectors are designed
such that the ends of the linearized vector lie adjacent to one another
on the Hprt map. Pairing of these vectors with their genomic homolog,
followed by recombination at the double-strand break, results in the
entire vector being inserted into the endogenous gene. This produces a
duplication of a portion of the Hprt gene. Open boxes indicate introns;
closed boxes indicate exons, numbered according to the map of Melton
et al. (1984); the crosshatched box indicates the neo r gene.



Results 

The Hprt gene encompasses over 33 kb of DNA and con- 
tains 9 exons that encode 1307 nucleotides of mRNA (Mel- 
ton et al., 1984). In Figure 1 we illustrate our strategies for 
inactivating the Hprt gene. The eighth exon in a cloned 
fragment of Hprt is disrupted by inserting the neo r gene. 
Following introduction of this DNA into ES cells, homologous
recombination transfers this disruption into the endogenous
Hprt gene, rendering the cells neo r -Hprt - and
therefore resistant to the drug G418 and the base analog 
6-TG. 

Using gene targeting in yeast as a paradigm (Hinnen et 
al., 1978; Orr-Weaver et al., 1981), we constructed two 
classes of vectors that we believed would disrupt the Hprt 
gene either by replacing endogenous sequences or by in- 
serting into the endogenous sequences. We termed these 
recombinant neo r -Hprt vectors replacement vectors (RV) 
and insertion vectors (IV). The mechanism of inactivating 
the endogenous Hprt gene by these two vectors is 
depicted in Figures 1A and 1B. It was of interest to determine
whether one or the other class of vectors was more
efficient at targeting. Furthermore, since the end results
using these two classes of vectors were predicted to be
different (note the partial duplication of the gene in Figure
1B), each could be used to generate different mutant alleles.

Reengineering the neo r Gene 

In the schemes outlined in Figure 1 for site-specific muta- 
genesis of the Hprt gene, the neo r gene is used both to 
disrupt the coding sequence of the target gene and as a 
tag to monitor the integration of the newly introduced DNA 
into the recipient genome. Effective use of the neo r gene 
as a tag requires expression of the gene in the Hprt locus. 
In general, if the neo r gene is to be used in a similar fash- 
ion to inactivate other genes, it must be expressed in as 
many chromosomal sites as possible.

In one of our mutagenesis schemes (Figure 1A), we require
the newly added neo r -containing sequences to convert
the endogenous gene. We suspect that the frequency
of gene conversion at the target locus may be inversely 
proportional to the length of nonhomology in the convert- 
ing sequence. This certainly appears to be the case for in- 
trachromosomal gene conversion (Letsou and Liskay, 
1987). 

Keeping the above points in mind, we have redesigned 
the neo r gene to optimize expression in ES cells while 
maintaining its size at a minimum. In Figure 2 we illustrate 
the neo r gene we have modified for this purpose. It is 
designated pMC1Neo. The neomycin protein coding sequence
(d) is from the bacterial transposon Tn5. The promoter
(b) that drives the neo r gene is derived from the
herpes simplex virus thymidine kinase gene (HSV-tk).
This promoter appears to be effective in embryonal carcinoma
(EC) cells (Nicolas and Berg, 1983; Rubenstein et
al., 1984; Stewart et al., 1985). To increase the efficiency
of the tk promoter, we introduced a duplication of a synthetic
65 bp fragment (a) derived from the PyF441 polyoma
virus enhancer. This fragment encompasses the
DNA sequence change that allows the polyoma mutant to
productively infect EC cells (Linney and Donerly, 1983).
Finally, because the native neo r gene translation initiation
signal is particularly unfavorable for mammalian translation,
a synthetic sequence (c) was substituted using Kozak's
rules as a guide (Kozak, 1986). A series of transfection
experiments (data not shown) demonstrated that
pMC1Neo, inserted into the eighth exon of Hprt, could utilize
the Hprt poly(A) addition signal. This obviated the
need to include a poly(A) addition signal in the construction.

Figure 2. The neo r Gene from pMC1Neo

The structural gene and its control elements are contained on a 1 kb
cassette flanked by an XhoI site (X) and a SalI site (S) in a pUC
derivative plasmid. (a) A tandem repeat of the enhancer region from
the polyoma mutant PyF441 consisting of bases 5210-5274 (Fujimura
et al., 1981). (b) The promoter of HSV-tk, from bases 92-218
(McKnight, 1980). (c) A synthetic translation initiation sequence,
GCXJAATATGGGATCGGCC. (d) The neo r structural gene from Tn5,
including bases 1555-2347 (Beck et al., 1982).

Each of the above modifications was found to contribute
additively to the transfection efficiency. The contribution
of each change was assessed by introducing the different
neo r constructs into mouse fibroblasts (L cells) and mouse
ES cells either by microinjection (Capecchi, 1980) or by
electroporation and assaying for the yield of G418 r colonies
(data not shown).

In Figure 3 we illustrate parallel experiments comparing
the transfection efficiency of three neo r vectors, pRSVNeo,
pSV2Neo, and pMC1Neo, in ES cells. The DNA was introduced
by electroporation. pRSVNeo contains the neo r
gene driven by the long terminal repeat from the avian
Rous sarcoma virus (Hudziak et al., 1982). This promoter,
with its accompanying enhancer, functions very efficiently
in mouse fibroblasts (Luciw et al., 1983) but is seen here
to function poorly in ES cells. pSV2Neo is an SV40
promoter-enhancer-based vector (Southern and Berg, 1982)
that appears to function moderately well in ES cells. From
Figure 3 it is apparent that pMC1Neo not only yields more
G418 r colonies than either pRSVNeo or pSV2Neo, but
also that the colonies are larger (i.e., the cells grow faster).
In mouse fibroblasts, all three vectors yield G418 r colonies
at comparable efficiencies.

In Table 1 the transfection efficiencies of pRSVNeo,
pSV2Neo, and pMC1Neo are quantitatively compared.
From these data, it is apparent that pMC1Neo yields 300-
to 800-fold and 25- to 50-fold more G418 r colonies than
pRSVNeo or pSV2Neo, respectively. One interpretation of
these results is that, since these vectors are inserting randomly
into the mouse genome, the transfection efficiencies
reflect the relative number of integration sites within
the genome that are compatible with sufficient neo r gene
expression to yield G418 r colonies. The higher transfection
efficiency of pMC1Neo may prove critical when attempting
to use neo r as a tag for targeting into genes that
are either expressed at low levels, such as Hprt, or not at
all. We have never observed G418 r colonies following
mock transfections of ES cells (see Table 1).
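
As a rough check on these fold differences, the frequencies can be recomputed from the colony counts reported in Table 1. The sketch below is illustrative only: the variable and function names are ours, and comparing every pairing of experiments is an assumption about how the ranges were obtained, so the endpoints agree with the quoted ranges only approximately.

```python
# Counts transcribed from Table 1: (surviving cells, G418-resistant colonies)
# for experiments 1 and 2 of each vector.
table1 = {
    "pRSVNeo": [(8.5e7, 1.2e2), (6.3e7, 2.0e2)],
    "pSV2Neo": [(8.7e7, 2.1e3), (9.3e7, 4.2e3)],
    "pMC1Neo": [(6.0e7, 7.5e4), (7.2e7, 6.8e4)],
}


def frequencies(rows):
    """Return G418-resistant colonies per surviving cell for each experiment."""
    return [colonies / survivors for survivors, colonies in rows]


def fold_range(reference, other):
    """Smallest and largest ratio reference/other over all experiment pairings."""
    ratios = [r / o for r in reference for o in other]
    return min(ratios), max(ratios)


freq = {vector: frequencies(rows) for vector, rows in table1.items()}

for vector in ("pRSVNeo", "pSV2Neo"):
    low, high = fold_range(freq["pMC1Neo"], freq[vector])
    print(f"pMC1Neo vs {vector}: {low:.0f}- to {high:.0f}-fold more colonies")
```

The exact endpoints depend on rounding and on which experiments are paired, so this should be read as a consistency check rather than a reproduction of the published calculation.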

Electroporation 

We have used electroporation to introduce the neo r -Hprt 
recombinant vectors into ES cells. The conditions for electroporation
(Neumann et al., 1982; Potter et al., 1984; Chu
et al., 1987) are described in Experimental Procedures. 
Under our optimal transfection conditions, 40%-60% of 
the cells survived electroporation and approximately 
1/1000 surviving ES cells became G418 r . The conditions 
for electroporation were further chosen to yield predomi- 
nantly single copy integrants. Of a dozen G418 r cell lines 
analyzed by Southern transfer, each contained a single 
copy of pMC1Neo integrated in the mouse genome (data
not shown). 
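
For planning purposes, the figures quoted so far (40%-60% survival, roughly 1/1000 survivors becoming G418 r, and, under optimal conditions, roughly 1/1000 transformants carrying a targeting event) can be combined into an expected yield. The following is a back-of-the-envelope sketch, not a procedure from this study; the function name and the default values are illustrative assumptions drawn from those round numbers.

```python
def expected_targeted_colonies(cells_electroporated,
                               survival=0.5,
                               g418_frequency=1e-3,
                               targeting_frequency=1e-3):
    """Estimate the number of targeted colonies from one electroporation.

    survival            -- fraction of cells surviving electroporation (0.4-0.6 in the text)
    g418_frequency      -- G418-resistant colonies per surviving cell (about 1/1000)
    targeting_frequency -- targeted colonies per G418-resistant colony
                           (about 1/1000 under optimal conditions)
    """
    survivors = cells_electroporated * survival
    transformants = survivors * g418_frequency
    return transformants * targeting_frequency


# Hypothetical experiment starting from 1e8 cells.
print(round(expected_targeted_colonies(1e8)))
```

Because the targeting frequency depends strongly on the vector used, the last parameter should be replaced with a measured value whenever one is available.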

Recombinant Vectors 

In Figure 4, we illustrate the vectors used to inactivate the 
Hprt gene. The vectors contain sequences from the 3' por- 
tion of the mouse Hprt gene cloned into a pUC9 plasmid. 
Figure 3. G418 r ES Cells Obtained by Transfection
with pRSVNeo, pSV2Neo, or pMC1Neo
ES cells were transfected by electroporation
with 25 µg/ml of the respective linearized
recombinant neo r vector. The conditions for
electroporation, cell culture, and selection for
G418 r colonies are described in Experimental
Procedures. Ten days following electroporation,
the cells were fixed with cold methanol
and stained with Giemsa. pMC1Neo does not
contain its own poly(A) addition signal (see
text). To evaluate its transfection efficiency,
either a synthetic poly(A) addition signal derived
from HSV-tk, nucleotides 1481-1530
(McKnight, 1980; Zhang et al., 1986), or a fragment
from the mouse Hprt gene (ScaI site in
exon 8 to the BglII site 3' to the end of the gene)
was added.
Table 1. Efficiency of Transfection

                      No. of Cells Surviving   No. of G418 r   Frequency of
Vector         Exp.   Electroporation          Colonies        G418 r Colonies

None (mock)    1      9.0 x 10^7               0               0
               2      9.4 x 10^7               0               0
pRSVNeo        1      8.5 x 10^7               1.2 x 10^2      1.4 x 10^-6
               2      6.3 x 10^7               2.0 x 10^2      3.2 x 10^-6
pSV2Neo        1      8.7 x 10^7               2.1 x 10^3      2.4 x 10^-5
               2      9.3 x 10^7               4.2 x 10^3      4.5 x 10^-5
pMC1Neo        1      6.0 x 10^7               7.5 x 10^4      1.25 x 10^-3
               2      7.2 x 10^7               6.8 x 10^4      0.94 x 10^-3

ES cells were transfected by electroporation with 25 µg/ml of either linearized
pRSVNeo, pSV2Neo, or pMC1Neo. Conditions for electroporation, cell culture, and
selection for G418 r colonies are described in Experimental Procedures.



In all vectors, the 1 kb neo r cassette from pMC1Neo has
been inserted into the eighth exon of the Hprt gene. To
minimize the extent of nonhomology between the endogenous
Hprt gene and the newly introduced DNA, sequences
required for growth of the recombinant vector in
bacteria were removed prior to introduction into ES cells.
In the process of removing these sequences, the recombinant
vector is converted to linear DNA that, compared
with supercoiled DNA, is a better substrate for gene targeting
(Thomas et al., 1986). As discussed previously,
these vectors fall into two classes, sequence replacement
vectors and sequence insertion vectors, based on the
predicted mode of targeting (see Figure 1).

Sequence replacement vectors were designed such
that, upon linearization, the vector Hprt sequences would
remain colinear with the endogenous Hprt sequences. In
other words, the 5' and 3' ends of the vector would correspond
to the 5' and 3' extents of sequence homology
with the endogenous gene (see Figure 1A). Three different
sequence replacement vectors were used in this
study. All of them contain a common 3' endpoint, 2.8 kb
downstream from the site of the neo r gene insertion, but
differ in the length of Hprt sequences 5' from the insertion.
The total length of Hprt homology in vectors pRV4.0,
pRV5.4, and pRV9.1 is 4.0, 5.4, and 9.1 kb, respectively.

The alternate class of vectors, the sequence insertion
vectors, were designed such that the separation of the
[...] from the pUC9 plasmids concomitantly created
[...] within the [...] sequences. The ends of these vectors
thus lie adjacent to one another along the Hprt map (see
Figure 1B). Two sequence insertion vectors were used,
pIV3.7 and pIV9.3. The endpoints of both linearized vectors
[...] exon 8. Both vectors contain [...] sequences 3' from
the neo r insertion, but differ in the extent of homology at
the 5' side of the [...].

Gene Targeting with Sequence Replacement Vectors

The three sequence replacement vectors were introduced
into ES cells by electroporation. Aliquots of the cells were
then subjected to one of three growth conditions:
nonselective media, to assess the total number of
cells surviving electroporation; G418 media, to assay the
fraction of survivors transformed by the neo r -containing
vectors; and G418, 6-TG media, to select for cells simultaneously
containing the neo r gene but lacking a functional
Hprt gene. As discussed below, in cells showing the
G418 r, 6-TG r phenotype, the endogenous Hprt gene was
inactivated by the targeted replacement of the endogenous
sequence with the neo r -recombinant sequence.

Figure 4. Targeting Vectors

The vectors were constructed as described in Experimental Procedures.
Closed boxes represent Hprt exons, numbered according
to the map of Melton et al. (1984). Open boxes represent introns and
3'-noncoding sequences. The crosshatched box represents the neo r
gene from pMC1Neo inserted into the eighth exon of Hprt. The ends
of each sequence on the diagram correspond to the site of insertion
of each sequence into a pUC-derivatized plasmid. Digestion of the
plasmids with the appropriate restriction endonuclease released the
[...] plasmid, creating the vectors
depicted above. The length of the Hprt sequences in each vector is as
follows: pRV4.0, 4 kb; pRV5.4, 5.4 kb; pRV9.1, 9.1 kb; pIV3.7, 3.7 kb;
[...]. All vectors contain the poly(A) addition sequences from
[...]. RV, replacement vector; IV, insertion vector; Bg, BglII;
Bs, BstEII; E, EcoRI; the dotted line in each pIV designates the point
of discontinuity of the Hprt gene due to the joining of the 5' and 3' ends
during vector construction. Bg', the internal BglII site in pRV9.1, was
[...] BglII and filling in with Klenow fragment
and dNTPs. This permits the excision of the replacement vector using
the terminal BglII sites.

Table 2. Gene Targeting Using Sequence Replacement Vectors

                 No. of Cells Surviving   No. of G418 r   No. of G418 r +    G418 r + 6-TG r /
Vector    Exp.   Electroporation          Colonies        6-TG r Colonies    G418 r

pRV4.0    1      5.3 x 10^7               8.1 x 10^4      2                  1/40,000
          2      4.3 x 10^7               4.3 x 10^4      2                  1/21,500
pRV5.4    1      11.0 x 10^7              6.9 x 10^4      10                 1/6,900
pRV9.1    1      7.8 x 10^7               3.0 x 10^4      32                 1/950

ES cells were transfected with 25 µg/ml of linearized pRV4.0, pRV5.4, or pRV9.1. The conditions for electroporation, cell culture, selection for
G418 r cell lines, and selection for G418 r, 6-TG r cell lines are described in Experimental Procedures.

In Table 2, we summarize the ability of the three different
sequence replacement vectors to confer G418 r and
G418 r, 6-TG r upon ES cells. Although the three vectors
transform ES cells to G418 r at a similar frequency
(~1/1000), there is a marked difference in their ability to
generate G418 r, 6-TG r colonies. Of the G418 r cells transformed
with the smallest vector, pRV4.0, only 1/40,000 to
1/20,000 showed resistance to 6-TG. However, in those
cells transformed to G418 r by the largest vector, pRV9.1,
1/950 also showed the 6-TG r phenotype. Transformation
by the intermediate-sized vector, pRV5.4, gave an intermediate
frequency of 6-TG resistance, with 1/7,000 G418 r
colonies showing the 6-TG r phenotype.
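
The dependence on homology length can be made explicit by recomputing the targeting frequencies from the colony counts in Table 2. The sketch below is illustrative: the names are ours, and pooling the two pRV4.0 experiments is our simplifying choice rather than something done in the table.

```python
# Rows transcribed from Table 2:
# (vector, kb of Hprt homology, G418-resistant colonies, G418 + 6-TG resistant colonies)
table2 = [
    ("pRV4.0", 4.0, 8.1e4, 2),
    ("pRV4.0", 4.0, 4.3e4, 2),
    ("pRV5.4", 5.4, 6.9e4, 10),
    ("pRV9.1", 9.1, 3.0e4, 32),
]


def pooled_frequency(rows):
    """Pool experiments per vector; return {vector: (homology_kb, targeted per G418 r colony)}."""
    totals = {}
    for vector, homology_kb, g418, targeted in rows:
        entry = totals.setdefault(vector, [homology_kb, 0.0, 0])
        entry[1] += g418
        entry[2] += targeted
    return {v: (kb, targeted / g418) for v, (kb, g418, targeted) in totals.items()}


pooled = pooled_frequency(table2)
base_kb, base_freq = pooled["pRV4.0"]

for vector, (kb, freq) in pooled.items():
    print(f"{vector}: {kb} kb homology, 1/{1 / freq:,.0f} targeted; "
          f"{kb / base_kb:.1f}x homology gives {freq / base_freq:.0f}x targeting")
```

On these counts, a little more than a twofold increase in homology (4.0 to 9.1 kb) corresponds to a more than 30-fold increase in targeting frequency, although the small numbers of colonies for the shorter vectors make the exact ratios uncertain.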

To show that the G418 r, 6-TG r phenotypes were the result
of gene-targeting events, the Hprt genes from 23 independently
isolated G418 r, 6-TG r cell lines were characterized
by Southern transfer analysis. In every instance
(23/23) the cells were shown to contain a single copy of
the Hprt gene harboring the neo r gene in exon 8. This result
was seen in cells transformed with either pRV4.0,
pRV5.4, or pRV9.1. Also, as expected, Hprt enzymatic activity
could not be detected in these cell lines. As judged
by their ability to incorporate [3H]hypoxanthine into their
nucleic acid, these cells contained 10 3 - to 10 4 -fold less
activity than the parental ES cell line (data not shown).

An example of the Southern transfer analysis is shown
in Figure 5A, in which the G418 r, 6-TG r cell line EP17-2M
is compared with the parental ES cell line. DNA from each
line was digested with the enzymes BglII, EcoRI, or BglII
plus EcoRI, electrophoresed in agarose, and transferred
to nitrocellulose paper. The paper was then hybridized
with radiolabeled DNA containing 1 kb of Hprt sequence
(Figure 5A, probe A).

As predicted from the restriction map of the cloned Hprt
gene (see Figure 5A), digestion of the ES DNA with BglII,
EcoRI, or BglII plus EcoRI isolates sequences homologous
to the Hprt probe on fragments of lengths 5.4 kb, 9.3
kb, or 3.7 kb, respectively. The digestion pattern of the
DNA from the G418 r, 6-TG r cell line is quite different,
showing fragments of 6.4 kb, 8.3 kb, and 2.7 kb. As illustrated
in Figure 5C, this pattern would exist if the endogenous
Hprt gene had been replaced by vector sequences
containing the neo r gene insertion. Because the neo r
gene insert has no BglII site, the size of the BglII fragment
is increased by the size of the insert (1 kb). However, because
the neo r gene does contain an EcoRI site, its presence
introduces a new EcoRI site into the Hprt gene,
resulting in the production of a smaller EcoRI fragment.

Figure 5. Southern Transfer Demonstration of Sequence Replacement
DNA was purified from each cell line and digested with restriction
endonuclease. DNA (7 µg) was loaded onto an agarose gel, electrophoresed,
transferred to nitrocellulose, and hybridized to 32P-labeled
DNA probes. ES refers to DNA from the parental, wild-type ES cell line.
17-2M refers to DNA from a G418 r, 6-TG r cell line transformed with the
replacement vector, pRV4.0. (A) Probe A was a 1 kb fragment of mouse
Hprt DNA extending from the BstEII site in intron 6 to the ScaI site in
exon 8. (B) Probe B was the 1200 bp DdeI fragment of the neo r structural
gene from the plasmid pRH140 (Thomas and Capecchi, 1986).
The lengths of the fragments are given in kb and were determined by
the coelectrophoresis of λ and plasmid fragments of known lengths.
(C) A schematic representation of the Southern transfer data. The top
map represents the 3' end of the Hprt gene from ES cells; the bottom
represents the Hprt gene from the 17-2M cell line. Open boxes represent
introns, closed boxes represent exons, and the crosshatched box
represents the neo r gene. Beneath each gene is shown the restriction
fragments hybridizing to the probes. Bg, BglII; E, EcoRI.

Table 3. Gene Targeting Using Sequence Insertion Vectors

                 No. of Cells Surviving   No. of G418 r   No. of G418 r +    G418 r + 6-TG r /
Vector    Exp.   Electroporation          Colonies        6-TG r Colonies    G418 r

pIV3.7    1      8.1 x 10^7               5.7 x 10^4      3                  1/19,000
pIV9.3    1      0.74 x 10^7              0.42 x 10^4     3                  1/1,400
          2      4.1 x 10^7               2.25 x 10^4     21                 1/1,100

ES cells were transfected by electroporation with 25 µg/ml of linearized pIV3.7 or pIV9.3. The conditions for electroporation, cell culture, selection
for G418 r cells, and selection for G418 r, 6-TG r cells are described in Experimental Procedures.

The interpretation is further verified when the same
DNAs are hybridized to sequences from the neo r gene
(Figures 5B and 5C, probe B). As expected, the parental
cell line contains no neo r homology. The G418 r, 6-TG r
derivative does show neo r homology at a site within the
Hprt locus. Digestion of the DNA with BglII isolates the
neo r gene on the same 6.4 kb fragment homologous to
the Hprt probe. Because the neo r gene contains an
EcoRI site at its 5' end, digestion with this enzyme
separates the neo r gene from sequences homologous to
the Hprt probe and thus creates a 2 kb fragment with
neo r homology.
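
The fragment sizes reported for the replacement event can be checked for internal consistency with simple arithmetic: the targeted fragment, or the pair of fragments created by the new EcoRI site, should add up to the wild-type fragment plus the 1 kb insert. The sketch below performs that check using only sizes given in the text; the function name and the tolerance are illustrative assumptions.

```python
INSERT_KB = 1.0  # size of the neo r cassette, as stated in the text

# digest: (wild-type fragment, targeted fragment detected by probe A,
#          neo-only fragment split off by the cassette's EcoRI site, or None)
observed = {
    "BglII": (5.4, 6.4, None),  # no BglII site in the insert
    "EcoRI": (9.3, 8.3, 2.0),   # new EcoRI site splits the enlarged fragment
}


def is_consistent(wild_type, targeted, neo_only, insert=INSERT_KB, tol=0.05):
    """True if the targeted fragment(s) sum to the wild-type fragment plus the insert."""
    total = targeted + (neo_only or 0.0)
    return abs(total - (wild_type + insert)) <= tol


for enzyme, sizes in observed.items():
    print(enzyme, "consistent with a 1 kb insertion:", is_consistent(*sizes))
```

The BglII plus EcoRI double digest is omitted because the size of the neo-only fragment in that digest is not reported here.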

It should be noted that the G418 r, 6-TG r cell line EP17-2M
was isolated following transformation with pRV4.0. Although
this vector lacks both the EcoRI and the BglII sites
5' to the neo r insertion site (see Figure 4), the cell line 
EP17-2M clearly has both sites at the predicted distance 
from the neo r gene. Such a positioning of two restriction 
sites is best explained by a targeted recombination event. 

Gene Targeting with Sequence Insertion Vectors 

The two sequence insertion vectors were linearized and
introduced into ES cells by electroporation. These cells
were then scored for total survivors, G418 r survivors, and
G418 r, 6-TG r survivors. The results of these experiments
are summarized in Table 3. The two vectors were equally
competent in the ability to confer G418 resistance upon
ES cells, but differed markedly in their ability to generate
G418 r, 6-TG r colonies. Whereas the smaller vector, pIV3.7,
generates 6-TG r cells at a frequency of 1/20,000 G418 r
cells, the larger, pIV9.3, induces 6-TG resistance at a frequency
of 1/1,100 to 1/1,400 G418 r cells. In all cases, the
G418 r, 6-TG r cells contained targeted mutations of their
Hprt loci.

To show that gene targeting was responsible for generating
the G418 r, 6-TG r phenotype, we analyzed by Southern
transfer 12 cell lines transformed by pIV9.3.
Unlike the case of the sequence replacement vectors, in
which all Hprt mutations were caused by the same type
of event, inactivation of the Hprt gene by sequence insertion
vectors occurred by two mechanisms. The majority of
targeting events caused by pIV9.3 (9/12) were due to the
insertion of the entire vector into the endogenous Hprt locus.
The remaining targeting events were sequence replacements,
resembling those events induced by the sequence
replacement vectors. Examples of each event are
shown in Figures 6 and 7.

Figure 6. Southern Transfer Demonstration of Sequence Insertion
Analysis was as described in Figure 5. ES is the parental cell line. 18-1F
is a G418 r, 6-TG r cell line transformed with the insertion vector pIV9.3.
(A) Hybridization with probe A containing 1 kb of Hprt sequences from
the BstEII site in intron 6 to the ScaI site in exon 8. (B) Hybridization
with probe B, the neo r gene. (C) A schematic representation of the
data. The top map represents Hprt sequences from the ES cell line.
The bottom map represents sequences from the cell line 18-1F. The observed
restriction fragments and their lengths are shown beneath each
map. Bg, BglII; E, EcoRI.

In Figure 6 we show the Southern transfer pattern of cell
line EP18-1F, a G418 r, 6-TG r cell line transformed with
pIV9.3. DNA from this cell line and DNA from the parental
ES line were digested with BglII, EcoRI, or BglII plus
EcoRI and probed with labeled Hprt sequences (Figure
6A, probe A). DNA from the parental cell line shows the
5.4, 9.3, and 3.7 kb fragments diagnostic of the wild-type
Hprt gene. DNA from the G418 r, 6-TG r cell line contains
these same fragments, but also contains fragments of 6.4,
8.3, and 2.7 kb. These latter fragments are characteristic of
the Hprt gene containing the neo r gene in exon 8. One
likely mechanism that would result in both sets of fragments
being recovered from the same cell is shown in Figure 6C.
If the entire vector, pIV9.3, is inserted into the Hprt locus
via homologous recombination, it will cause a 9.3 kb duplication
of the Hprt sequences. The most 5' duplicated region
will contain the neo r gene, whereas the most 3'
duplicated region will contain wild-type sequences. Restriction
enzyme digestions of this DNA will thus produce
the hybrid configuration seen. This interpretation is further
confirmed when the DNA from such a cell line is also
probed with neo r sequences. As shown in Figure 6B, only
one copy of the duplicated region contains neo r homology.

Figure 7. Southern Transfer Demonstration of Sequence Replacement
Induced by a Sequence Insertion Vector
Analysis was as described in Figure 5. ES is the parental cell line; 18-2N
is a G418 r, 6-TG r cell line transformed with the insertion vector
pIV9.3. The probe was the 1 kb Hprt fragment from the BstEII site in
intron 6 to the ScaI site in exon 8 (probe A, Figure 5C). Bg, BglII; E,
EcoRI. Note that the hybridization pattern is identical to that of cell line
17-2M (Figure 5).

In 3/12 cell lines examined by Southern transfer analysis,
this insertion pattern was not seen. Instead, the
endogenous Hprt sequences appeared to have been
replaced by the vector sequences containing the neo r insert.
An example of one such cell line, EP18-2N, transformed
by pIV9.3, is shown in Figure 7. DNA from this cell
line and DNA from the parental ES line were digested with
BglII, EcoRI, or BglII plus EcoRI and probed with Hprt sequences
(Figure 7). The restriction pattern generated from
this DNA is indistinguishable from that generated by
digestion of DNA from cell line EP17-2M (see Figure 5A),
a cell line transformed by a sequence replacement vector.
Thus, sequence insertion vectors are also substrates for
the sequence replacement reaction.

In examining the DNA products of the sequence replacement
reactions, a low level of Hprt sequences lacking
the neo r sequences is detected (see Figure 5A; Figure 7).
These endogenous-length Hprt sequences come
from the feeder cells upon which the ES cells are grown.
The feeder cells are nonproliferating due to pretreatment
with mitomycin C and represent a minor component of the
total number of cells on the plate. However, the presence
of such contaminating Hprt sequences presents a problem
in our analysis of the sequence insertion events. In
these latter events, the neo r -containing vector sequences
lie adjacent to the endogenous sequences, and hybridization
analysis revealed the presence of these two Hprt sequences
in equal proportions (Figure 6A). We felt compelled
to demonstrate that the endogenous-length Hprt
copy detected in this Southern transfer was in fact adjacent
to the inserted copy and not the result of feeder cell
contamination.

Figure 8. Southern Transfer Analysis of DNAs Digested with HindIII
DNAs were digested with HindIII and analyzed as described in Figure
5. ES is the parental cell line; 18-1F is a G418 r, 6-TG r cell line transformed
by sequence insertion; 17-2M is a G418 r, 6-TG r cell line transformed
by sequence replacement. (A) Cellular DNA was probed with
Hprt probe A. (B) A diagram of the 3' region of the Hprt genes from the
three cell lines analyzed. Bg, BglII; E, EcoRI; H, HindIII.

To do this, DNA was digested with the restriction
endonuclease HindIII and probed with Hprt sequence. Such
a digestion permits a distinction to be made between a 
single copy endogenous sequence and a sequence adja- 
cent to the inserted vector. As shown in Figure 8, the en- 
dogenous Hprt sequences, represented by the parental 
ES cell line, are contained on an 11.4 kb fragment.
Sequences containing a single copy of the Hprt gene
disrupted by the neo r gene, from cell line EP17-2M, are
isolated on a 12.4 kb band. Digestion of DNA from the cell
line EP18-1F, transformed by a sequence insertion vector, 
gives two Hprt fragments of equal intensity, one of 12.4 kb 
and one of 9.3 kb. This digestion pattern is quite consistent 
with a sequence insertion event. 
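
The fragment sizes quoted above can be cross-checked with simple length bookkeeping. The sketch below is an illustration only and is not part of the original analysis. It assumes that the neo r cassette is exactly 1.0 kb (the text describes it as a 1 kb unit), that an insertion event adds only the 9.3 kb Hprt fragment plus the cassette, and that the HindIII fragments detected by the probe together span the same region in all three cell lines. The function and variable names are hypothetical.

```python
# Sizes in base pairs, taken from the text; the cassette size is an assumption.
ENDOGENOUS_HINDIII = 11_400   # fragment seen in the parental ES line
HPRT_HOMOLOGY = 9_300         # cloned Hprt fragment carried by the vector
NEO_CASSETTE = 1_000          # assumed exact size of the neo r unit


def expected_total(event: str) -> int:
    """Total length of Hprt-hybridizing HindIII DNA predicted for an outcome."""
    if event == "parental":
        return ENDOGENOUS_HINDIII
    if event == "replacement":
        # The endogenous copy is exchanged for a copy carrying the cassette.
        return ENDOGENOUS_HINDIII + NEO_CASSETTE
    if event == "insertion":
        # The whole vector is added, duplicating the 9.3 kb region.
        return ENDOGENOUS_HINDIII + HPRT_HOMOLOGY + NEO_CASSETTE
    raise ValueError(f"unknown event: {event}")


# Fragment sizes reported for Figure 8.
observed = {
    "parental": [11_400],           # ES
    "replacement": [12_400],        # EP17-2M
    "insertion": [12_400, 9_300],   # EP18-1F
}

for event, fragments in observed.items():
    consistent = sum(fragments) == expected_total(event)
    print(f"{event}: observed {sum(fragments)} bp, consistent = {consistent}")
```

Under these assumptions the two bands in the insertion line sum to the endogenous fragment plus one full copy of the vector, which is what a single insertion event would predict.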

Discussion 

We have analyzed 38 independent G418 r , 6-TG r ES cell
lines. Each of these cell lines was shown to have arisen
from inactivation of the endogenous Hprt gene by
homologous recombination with the introduced neo r -Hprt
fragment. Spontaneous formation of G418 r , 6-TG r cells was
not detected (i.e., occurs at less than 1/10^9 cells per
generation).

None of the G418 r , 6-TG r cell lines contained extrane- 
ous copies of the neo r -Hprt recombinant vector in- 
tegrated randomly within the ES genome. This greatly 
simplified the analysis of the targeting events in these cell 
lines and permitted an unambiguous interpretation of the 
results. The absence of extraneous copies of the input 
vector in the targeted cells should also simplify
interpretation of additional studies that will entail establishing a
correlation between the inactivation of specific genes and
the resultant phenotypes.

Under our optimal conditions, we have observed a 
gene-targeting frequency, relative to the frequency of ran- 
dom integration of the input vector, of 1/1000. The param- 
eters that we believe influenced the success of these ex- 
periments include using a neo r gene that is efficiently 
expressed in ES cells, maintaining the size of the neo r 
gene at a minimum, using extensive homology between 
the homing sequence and the target sequence, and re- 
moving, prior to transfection, unnecessary and nonhomol- 
ogous sequences from the input vector. 

This gene-targeting frequency is sufficiently high to be 
used for inactivating nonselectable genes. Direct screen- 
ing, by Southern transfer analysis, for a gene-targeting 
event among 1000 candidate cell lines would not be exor- 
bitant. Furthermore, gene-targeting enrichment proce- 
dures could be added to the protocol for using the neo r 
gene as a transfection tag. For example, a neo r gene 
lacking an enhancer or a poly(A) addition signal could be 
positioned within the homing sequence in such a way that 
homologous recombination with the target gene would 
juxtapose the defective neo r gene with the sequences
required for effective expression. Random integration of the
same vector into the recipient genome would not normally
bring the required sequence sufficiently near the neo r
gene to yield G418 r colonies. Pilot experiments testing
such procedures indicate that enrichment of several hun- 
dred fold for gene targeting compared with random in- 
tegration should be attainable (unpublished results). 
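
The direct-screening estimate above can be made concrete with a short calculation. This is a sketch under a stated assumption, not a procedure from the paper: each G418 r colony is treated as an independent trial that is a targeted clone with probability 1/1000. The function names are hypothetical.

```python
import math


def prob_at_least_one(frequency: float, n_screened: int) -> float:
    """Chance that n_screened independent colonies include a targeted clone."""
    return 1.0 - (1.0 - frequency) ** n_screened


def colonies_needed(frequency: float, confidence: float) -> int:
    """Smallest number of colonies that reaches the requested confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


targeting_frequency = 1 / 1000  # relative to random integration, from the text

p = prob_at_least_one(targeting_frequency, 1000)
n95 = colonies_needed(targeting_frequency, 0.95)
print(f"P(at least one targeted clone among 1000 colonies) = {p:.2f}")
print(f"Colonies to screen for 95% confidence = {n95}")
```

Under this model, screening 1000 colonies gives a bit better than even odds of recovering a targeted clone, and roughly 3000 colonies would be needed for 95% confidence, which is one reason the enrichment procedures discussed above are attractive.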

The gene-targeting frequency was observed to be very
sensitive to the extent of homology between the
exogenous and cognate endogenous sequence. A 2-fold
increase in homology increased the gene-targeting
frequency by 20-fold. Further increases in the extent of
homology may increase the gene-targeting frequency
even more.

We have compared two classes of neo r -Hprt
recombinant vectors, one that replaces endogenous sequences
with exogenous sequences and another that inserts exog- 
enous sequences into the endogenous sequence. Both 
classes exhibit comparable gene-targeting frequencies 
and are equally sensitive to the extent of homology with 
the endogenous target. We have termed the former
sequence replacement vectors and the latter sequence
insertion vectors. In 23/23 G418 r , 6-TG r cell lines obtained
by introducing the replacement vector, the endogenous
Hprt gene was inactivated by sequence replacement. Of
the 12 G418 r , 6-TG r cell lines obtained with the insertion
vector pIV9.3, 9 resulted from sequence insertion. In the
remaining 3 the Hprt gene was inactivated by sequence 
replacement. The latter may result from a crossover oc- 
curring at points within the vector sequences rather than 
at both termini (Szostak et al., 1983). Though insertion 
vectors mediate mutagenesis via two pathways, they tar- 
get predominantly by inserting into the endogenous gene. 

The insertion vectors are technically more difficult to 
build. On the other hand, they may provide the means for 
generating a wider spectrum of mutant alleles. For
example, the neo r gene can be placed in the 3'-untranslated
sequence and still be used as a transfection tag. In such
a vector, the neo r gene could be linked to a wide
spectrum of mutations, including point mutations, small
insertions, or small deletions, in upstream exons. In the
process of insertion of the neo r vector, these mutations
would be concomitantly transferred into the endogenous 
gene. 

When we initiated these experiments we had two con- 
cerns about using the Hprt gene as our target: it is ex- 
pressed at a low level in ES cells, and it contains many 
repetitive DNA sequences. As in most cells, HPRT protein 
represents approximately 1/5,000 of the soluble protein 
(Hughes et al., 1975). Furthermore, repetitive DNA se- 
quences are dispersed throughout both the neo r -Hprt 
recombinant vector and the Hprt gene. In fact, it is not a
simple task to identify a suitable probe from the Hprt locus
for Southern transfer analysis. The success obtained in
targeting to the Hprt gene despite these handicaps may
indicate that in the future we need not be so concerned
with these parameters negatively influencing the
gene-targeting frequency.

Conclusion 

We have demonstrated that we can inactivate by gene
targeting a specific locus in the mouse genome. The protocol
we have developed to inactivate the endogenous Hprt 
gene should be adaptable to other genes as well. We have 
also shown that ES cells are a suitable host for
gene-targeting experiments. It is hoped that this combination of
using ES cells as the recipient cell line and site-specific 
mutagenesis achieved by gene targeting will provide the 
means for generating mice of any desired genotype. An 
advantage of this scenario is that the first generation
chimera will usually be heterozygous for the targeted
mutation and that subsequent breeding can be used to
generate the homozygous animal. Thus, only one of the two loci
need be inactivated, and recessive lethals can be
maintained as heterozygotes. If successful, this technology will
be used in the future to dissect the developmental path- 
way of the mouse as well as to generate mouse models 
for human genetic diseases. 

Experimental Procedures 

Vector Construction 

Hprt sequences were isolated from a λ Charon 4A library containing
a partial EcoRI digest of DNA from a mouse ARK cell line (the library
was provided by Doug Foster, Ohio State University). The library was
screened with a human cDNA Hprt probe (courtesy of C. Thomas
Caskey). A recombinant phage containing the 9.3 kb EcoRI fragment
encoding Hprt exons 6-9, as well as the 2.2 and 1.0 kb fragments 3' of
the Hprt gene, was isolated. The 9.3 kb and 2.2 kb fragments were
subcloned into pUC9 and converted by standard cloning methods into the
targeting vectors. To introduce the neo r gene into Hprt exon 8, an
8 bp XhoI linker (New England Biolabs) was ligated into the ScaI site
in exon 8.

The neo r vector pMC1Neo was created by the sequential ligation of
its four functional domains (see Figure 2) into pUC9. The polyoma
enhancer sequences were chemically synthesized on an Applied
Biosystems model 380B DNA synthesizer and were flanked with XhoI (5'
end) and SalI (3' end) restriction sites. To create the enhancer dimer
used in pMC1Neo, the monomer units were ligated in vitro and the
dimer was purified from a polyacrylamide gel. The HSV-tk promoter
sequence from bases 92 to 121 was chemically synthesized and ligated
in vitro to bases 122-218, isolated as an EcoRI-PstI fragment from the
HSV-tk gene. The translational start sequence was synthesized
chemically. The neo r gene was derived from the transposon Tn5. The
structure of pMC1Neo was confirmed by DNA sequence analysis. pMC1Neo
was designed such that the neo r gene and all its control elements
could be removed as a 1 kb unit following digestion by XhoI and SalI
and thus inserted into the XhoI site in the Hprt gene of the various
targeting vectors.

Southern transfer analysis was performed as described previously
(Thomas et al., 1986).

Isolation and Culturing of ES Cells 

ES cells were isolated from C57BL/6 blastocysts as described by Evans
and Kaufman (1981) except that primary embryonic fibroblasts
(Doetschman et al., 1985) were used as feeders rather than STO cells.
Briefly, 2.5 days postpregnancy mice were ovariectomized, and
delayed blastocysts were recovered 4-6 days later. The blastocysts were
cultured on mitomycin C-inactivated primary embryonic fibroblasts.
After blastocyst attachment and the outgrowth of the trophectoderm,
the ICM-derived clump was picked and dispersed by trypsin into
clumps of 3-4 cells and put onto new feeders. All culturing was carried
out in DMEM plus 20% FCS and 10^-4 M β-mercaptoethanol. The
cultures were examined daily. After 6-7 days in culture, colonies that still
resembled ES cells were picked, dispersed into single cells, and
replated on feeders. Those cell lines that retained the morphology and
growth characteristics of ES cells were tested for pluripotency in vitro.
These cell lines were maintained on feeders and transferred every 2-3
days. For comparative purposes we have also used ES cell lines kindly
provided by Martin Evans and Gail Martin. Cell lines from all three
sources yielded targeted G418 r , 6-TG r colonies at comparable
frequencies. The G418 r , 6-TG r cell lines are morphologically
indistinguishable from the parental ES cells and retained their pluripotency in
vitro (i.e., they differentiate when grown on petri plates in the absence of a
feeder layer and form embryoid bodies when grown in suspension).

Electroporation and Isolation of G418 r and G418 r , 6-TG r Cell Lines

DNA was introduced into the ES cells by electroporation using the
Promega Biotech X-Cell 2000. Rapidly growing cells were trypsinized,
washed in DMEM, counted, and resuspended in buffer containing 20
mM HEPES (pH 7.0), 137 mM NaCl, 5 mM KCl, 0.7 mM Na2HPO4, 6
mM dextrose, and 0.1 mM β-mercaptoethanol. Just prior to
electroporation, the linearized recombinant vector was added. The cells were then
exposed to a single, 625 V/cm pulse at room temperature, allowed to
remain in the buffer for 10 min, and plated onto feeder cells. For every
experiment, aliquots of cells were removed before and after
electroporation to measure colony-forming units; 40%-60% of the cells survived
electroporation.

In a typical experiment 10^7 cells per vial were transfected by
electroporation with 25 µg/ml of linearized vector. Aliquots of cells were
then subjected to one of three growth conditions: nonselective media,
to evaluate the number of cells surviving electroporation; G418 media,
to determine the fraction of survivors transformed by the neo r vector;
and G418, 6-TG media, to select cells that had simultaneously acquired
a neo r gene and lost a functional Hprt gene. For these experiments
the ES cells were grown on mitomycin C-inactivated STO cells
(obtained from Alan Bradley). To ensure inactivation, the STO cells were
treated with 10 µg/ml mitomycin C for 4 hr. Survival was less than
1/10^9 cells.

ES cells that were to be grown on nonselective medium were diluted
4 x 10^4-fold and 8 x 10^4-fold prior to plating onto 100 mm dishes
containing the feeder cells. Cells that were to be subjected to growth on
G418 medium or G418, 6-TG medium were diluted 200-fold and 26-fold,
respectively, before plating onto feeders. To allow for expression of the
neo r gene, the cells were first plated in nonselective medium and, 48
hr later, were transferred to G418-containing medium (250 µg/ml). To
allow for the decay of the endogenous HPRT activity, the cells that
would eventually be subjected to G418, 6-TG selection were first plated
in nonselective medium. Two days later they were transferred to G418
medium, and 5 days after electroporation they were transferred to
G418, 6-TG (1 µg/ml) medium.

At each transfer, the cells were trypsinized and placed on a new 
feeder plate in their respective medium. It is necessary to disperse the 
cells prior to subjecting them to selection because ES cells grow in 
tight clumps and cross-feed extensively. During this period of selec- 
tion, the cells are dividing and it is necessary, for quantitative analysis, 
to keep track of the number of cell divisions. For this purpose, aliquots 
of cells from the same experiment were grown in nonselective me- 
dium, subjected to the same transfer protocol, and used to measure 
cell proliferation during this period. An outcome of the above protocol 
is that if a single targeting event occurs during the time of electropora- 
tion, one of the G418, 6-TG plates will yield a burst of 32-64 G418 r , 
6-TG r colonies; the rest of the plates will contain no colonies. This is 
scored as a single event. Each of the G418 r , 6-TG r cell lines that we 
obtained came from such individual bursts, indicating that the target- 
ing events occurred at the time of electroporation. The above scoring 
procedure underestimates the gene-targeting frequency since a burst 
of G418 r , 6-TG r colonies on a given plate may have arisen from more
than one event. For example, if the Poisson function was used to
correct the pRV9.1 data shown in Table 2 for the expected number of plates
that resulted from two events, then the gene-targeting frequency would
be 1/800 G418 r colonies rather than 1/950 G418 r colonies.
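
The kind of correction described above can be sketched as follows. This is a hedged illustration, not the authors' calculation: the text corrects for plates that resulted from two events, whereas the sketch uses the common zero-class Poisson estimate, which infers the mean number of events per plate from the fraction of plates with no colonies and therefore allows for any number of events per plate. The plate counts are hypothetical and are not taken from Table 2.

```python
import math


def poisson_corrected_events(total_plates: int, positive_plates: int) -> float:
    """Estimate independent events from the fraction of colony-free plates."""
    if not 0 <= positive_plates < total_plates:
        raise ValueError("counts must leave at least one colony-free plate")
    fraction_empty = (total_plates - positive_plates) / total_plates
    # For a Poisson process, P(no event on a plate) = exp(-mean).
    mean_events_per_plate = -math.log(fraction_empty)
    return mean_events_per_plate * total_plates


# Hypothetical example: 20 selection plates, 6 of which show a colony burst.
total_plates = 20
positive_plates = 6

events = poisson_corrected_events(total_plates, positive_plates)
print(f"Plates scored as single events: {positive_plates}")
print(f"Poisson-corrected number of events: {events:.2f}")
print(f"Correction factor: {events / positive_plates:.2f}")
```

The correction grows with the fraction of positive plates, so it matters most for the vectors with the highest targeting frequencies and is negligible when nearly all plates are empty.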